Iterative Unsupervised GMM Training for Speaker Indexing
Abstract
This paper presents a novel algorithm for speaker search and indexing based on unsupervised GMM training. The proposed method does not require a predefined set of generic background models; the GMM speaker models are trained only from test samples. The constraint of the method is that the number of speakers must be known in advance. The results of initial experiments show that the proposed training method can produce precise GMM speaker models from only a small amount of training data.
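The abstract does not spell out the training loop, so the following is only a minimal sketch of one common way such an iterative unsupervised scheme can be organised: assign segments to a known number of speakers, train one model per speaker from the assigned segments, re-assign each segment to the best-scoring model, and repeat until the assignment stops changing. Several things here are assumptions and not taken from the paper: each speaker is modelled by a single diagonal-covariance Gaussian (a one-component stand-in for a full GMM), the initial assignment is a naive round-robin, and the names `fit_diag_gaussian`, `avg_log_likelihood`, and `iterative_speaker_indexing` are hypothetical.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-3):
    """Estimate per-dimension mean and variance from a list of feature vectors.

    Assumption: one diagonal Gaussian stands in for a full GMM speaker model.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so tiny clusters do not produce degenerate models.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of a segment under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def iterative_speaker_indexing(segments, num_speakers, max_iters=20):
    """Return one speaker label per segment.

    segments: list of segments, each a list of feature vectors.
    num_speakers: must be known in advance, as the abstract requires.
    """
    # Assumed initialisation: round-robin assignment of segments to speakers.
    labels = [i % num_speakers for i in range(len(segments))]
    for _ in range(max_iters):
        # Train one model per speaker using only the test segments.
        models = []
        for k in range(num_speakers):
            frames = [
                f for seg, lab in zip(segments, labels) if lab == k for f in seg
            ]
            models.append(fit_diag_gaussian(frames) if frames else None)
        # Re-assign every segment to its best-scoring speaker model.
        new_labels = []
        for seg in segments:
            scores = [
                avg_log_likelihood(seg, m) if m is not None else float("-inf")
                for m in models
            ]
            new_labels.append(scores.index(max(scores)))
        if new_labels == labels:  # assignment is stable, stop iterating
            break
        labels = new_labels
    return labels
```

Calling `iterative_speaker_indexing(segments, 2)` on a list of per-segment feature vectors returns one integer label per segment. In a realistic setting the single Gaussian would be replaced by a multi-component GMM trained with EM, and the initialisation would likely need more care than round-robin, since a poor starting assignment can leave the loop in a bad local optimum.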
Similar resources
Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation
This paper addresses unsupervised speaker indexing for discussion audio archives. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the Bayesian Information Criterion (BIC) according to input utterances. The framework makes it possible to use a discrete model when the data is sparse, and to seamlessly switch to a continuous model after a large cluster is...
Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion
This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, thus the duration of utterances is very short and its variation is large, which causes significant problems in applying conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that select...
GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models
In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...
Joint adaptation and adaptive training of TVWR for robust automatic speech recognition
Context-dependent Deep Neural Networks have obtained consistent and significant improvements over Gaussian Mixture Model (GMM) based systems for various speech recognition tasks. However, since a DNN is discriminatively trained, it is more sensitive to label errors and is not reliable for unsupervised adaptation. Moreover, DNN parameters do not have a clear and meaningful interpretation, theref...
A Study of Generic Models for Unsupervised On-line Speaker Indexing
On-line speaker indexing sequentially detects the points where a speaker identity changes in a multi-speaker audio stream, and classifies each speaker segment. This paper addresses two challenges: the first relates to monitoring, which requires on-line processing. The second relates to the fact that the number/identity of the speakers is unknown. The indexing needs to be made in an unsupervised p...